Abstract

Echo hiding offers high watermark capacity but limited
robustness. We first propose a stereo echo hiding method
that uses both channels to carry each watermark bit,
replacing the original single echo hiding algorithm. We
then combine this method with a spread spectrum algorithm
in the Modulated Complex Lapped Transform (MCLT) domain.
Experimental results show higher robustness than the
original echo hiding method.
Introduction

Echo hiding was first described in [1]. An echo can be
regarded as a delayed version of the signal itself, and
the delay can be made small enough that the echo is not
audible. Many algorithms based on echo hiding have been
developed, such as single echo, double echo,
forward-backward echo and time-spread echo hiding
[2, 3, 4, 5], together with their improvements. Yousof
Erfani et al. [6] proposed three methods based on single
and double echo hiding. Foo Say Wei et al. [7] embedded
watermark bits 0 and 1 into different channels to achieve
higher robustness. Duan et al. [8] introduced an improved
cepstrum method that raises accuracy by making use of the
cepstral values of the original signal. Experimental
results in [8] show that echo hiding has high watermark
capacity but not high robustness.

The Modulated Lapped Transform (MLT) is commonly
used to implement block transform coding in video
and audio compression. It allows perfect
reconstruction and has almost optimal performance for
transform coding of a wide variety of signals, with no
blocking artifacts [9]. In [10] the Modulated Complex
Lapped Transform (MCLT) was introduced as a simple
extension of the MLT that preserves its advantages.
Subsequent research has improved the speed of the MCLT
algorithm [11-15]. We use the MCLT to improve the
robustness of the echo hiding method in [1].

The outline of the paper is as follows. We propose a 
new echo hiding method for stereo signals in Section 2 
and combine MCLT with our echo hiding method to 
improve robustness. The evaluation results and 
discussions are presented in Section 3. Finally the 
conclusions are given in Section 4. 

Capacity Enlargement — Combining Echo 
Hiding with Spread Spectrum 

Echo hiding offers a relatively large data capacity.
The spread spectrum method embeds watermarks in the
frequency domain, while echo hiding works in the time
domain. Both methods process audio sequences block by
block, so they can be applied to the same blocks.
Based on [1], we propose a new stereo echo hiding
method below. We then combine it with the spread
spectrum method in the MCLT domain to improve
robustness.

Our Improved Echo Hiding Method 

We propose a new echo hiding scheme for stereo
signals. Both channels are used to encode a watermark
bit, each with a different delay value. To encode
watermark bit 0, echoes with delay value $d_0$ are
embedded in the left channel and echoes with delay
value $d_1$ are embedded in the right channel. To encode
watermark bit 1, echoes with delay value $d_1$ are
embedded in the left channel and echoes with delay
value $d_0$ are embedded in the right channel.

When extracting the embedded watermark, we compare
the difference between the cepstrum values at the two
delay points across the two channels.

The embedding procedure in our method, based on [1], is as follows.

Input: a stereo audio signal block $x_i(n)$, echo bit $e$, initial echo amplitude $\alpha$; $n$ varies from 0 to $M-1$, $e \in \{0,1\}$, and $M$ is the number of samples in the block

Watermark embedding procedure:

1) Divide the stereo audio signal block $x_i(n)$ into the left channel portion $x_i^l(p)$ and the right channel portion $x_i^r(q)$; $p$ and $q$ vary from 0 to $M/2 - 1$

2) Embed the echo bit into both channels of the block and get the resulting sequences (see [1])

$y_i^l(p) = x_i^l(p) + \alpha \, x_i^l(p - d_e)$
$y_i^r(q) = x_i^r(q) + \alpha \, x_i^r(q - d_{1-e})$

3) Recombine the left portion and the right portion and get the watermarked audio block $y_i(n)$

Output: a processed audio signal block $y_i(n)$
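The embedding steps above can be sketched in a few lines of Python with NumPy. This is a minimal illustration, not the authors' implementation: the function names, the array layout (one column per channel) and the choice to treat samples before the start of the block as zero are assumptions, and the decay rate mentioned later in the text is not modelled.

```python
import numpy as np

def add_echo(x, delay, alpha):
    """Return x(n) + alpha * x(n - delay); samples before the block count as zero."""
    x = np.asarray(x, dtype=float)
    y = x.copy()
    y[delay:] += alpha * x[:-delay]
    return y

def embed_stereo_echo(block, bit, d0, d1, alpha):
    """Embed one bit into a stereo block of shape (samples, 2).

    bit 0: delay d0 on the left channel, d1 on the right channel.
    bit 1: delay d1 on the left channel, d0 on the right channel.
    """
    delays = (d0, d1)
    left = add_echo(block[:, 0], delays[bit], alpha)       # delay d_e
    right = add_echo(block[:, 1], delays[1 - bit], alpha)  # delay d_{1-e}
    return np.stack([left, right], axis=1)
```

A call such as `embed_stereo_echo(block, 1, d0=100, d1=150, alpha=0.1)` uses hypothetical delay values; the source does not state the delays that were used.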



The extracting procedure in our method is as follows. 



Input: a processed audio signal block $y_i(n)$

Watermark extracting procedure:

1) Divide the stereo audio signal block $y_i(n)$ into the left channel portion $y_i^l(p)$ and the right channel portion $y_i^r(q)$; $p$ and $q$ vary from 0 to $M/2 - 1$

2) Compute the cepstrum of both channels (see [1])

$c_l(p) = F^{-1}(\log F(y_i^l(p)))$
$c_r(q) = F^{-1}(\log F(y_i^r(q)))$

3) Decide the echo hiding bit $e$

$e = 0$ if $c_l(d_0) - c_l(d_1) > c_r(d_0) - c_r(d_1)$
$e = 1$ if $c_l(d_0) - c_l(d_1) < c_r(d_0) - c_r(d_1)$

Output: extracted echo bit $e$
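The decision rule above can be illustrated with a short Python sketch. It rests on two assumptions that the text does not settle: the real cepstrum (log of the magnitude spectrum) stands in for $F^{-1}(\log F(\cdot))$, and a tie between the two differences is resolved as bit 1. The function names are illustrative.

```python
import numpy as np

def real_cepstrum(x):
    """Real cepstrum: inverse FFT of the log magnitude spectrum."""
    spectrum = np.fft.fft(np.asarray(x, dtype=float))
    log_mag = np.log(np.abs(spectrum) + 1e-12)  # small offset avoids log(0)
    return np.real(np.fft.ifft(log_mag))

def extract_stereo_echo_bit(block, d0, d1):
    """Decide the echo bit of a stereo block of shape (samples, 2)."""
    c_l = real_cepstrum(block[:, 0])
    c_r = real_cepstrum(block[:, 1])
    # Bit 0 puts the d0 echo in the left channel, so the left-channel
    # difference c(d0) - c(d1) exceeds the right-channel one.
    return 0 if (c_l[d0] - c_l[d1]) > (c_r[d0] - c_r[d1]) else 1
```

Comparing the two channels against each other, rather than comparing each cepstral peak with a fixed threshold, means that cepstral content shared by both channels tends to cancel in the decision.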

Table 1. Correct Bit Rate (CBR) and Signal-to-Noise Ratio (SNR) of the single echo hiding method [1] and our proposed echo hiding method

audio sequence    single echo hiding [1]    our proposed echo hiding
                  CBR        SNR            CBR        SNR
clip 1-10         76.09%     16.008         79.53%     17.693
clip 11-20        75.78%     17.540         78.13%     18.784
clip 21-30        76.72%     18.653         78.59%     20.118
clip 31-40        75.16%     20.927         76.09%     22.147
clip 41-50        74.22%     17.406         81.09%     18.749
clip 51-60        80.16%     15.564         85.78%     17.249
clip 61-70        84.69%     20.686         84.84%     22.019
average           77.54%     18.112         80.58%     19.537




Table 2. CBR of the single echo hiding method [1] and our proposed echo hiding method under various attacks

Attacks                    single echo hiding [1]    our proposed echo hiding
Re-sampling (22.05 kHz)    66.75%                    82.83%
MP3 compression            56.50%                    56.47%
Noise attack               62.70%                    65.11%
Low-pass filtering         61.56%                    73.37%
High-pass filtering        63.39%                    64.15%




We use Sound Quality Assessment Material (SQAM)
audio [16] as test material to compare our echo hiding
method with the original single echo hiding method.

Table 1 shows that our proposed method has a higher
extraction rate and better perceptual transparency
than the single echo hiding method. Table 2 shows that
our proposed echo hiding method is more robust than
single echo hiding under all tested attacks except MP3
compression, where the two are nearly equal (56.47%
versus 56.50%).

Note that the comparison between single echo hiding
and our proposed echo hiding is based on the same
embedding capacity of one watermark bit per 2048
samples of two channels, the same decay rate of 0.5,
and initial amplitudes of 0.1 and 0.08 respectively.
Our method is computationally more complex than the
method in [1].

Watermark Embedding Procedures of the Proposed 
Approach 

Next we combine the echo hiding method proposed above
with the spread spectrum method in the MCLT domain to
obtain a new scheme that works in both the time domain
and the transform domain. On the encoder side, we tried
both combining orders and found that embedding the
echoes after the MCLT transform results in better
robustness. On the decoder side, since both methods
only read the audio sequence rather than modify it,
the order does not matter.

The embedding procedure is as follows; steps 1-3
come from [17].

Watermark Extracting Procedures of the Proposed 
Approach 

The extracting procedure is as follows; steps 1-3
come from [17].

Let $p \cdot q$ denote the normalized inner product of
vectors $p$ and $q$, i.e., $p \cdot q = N^{-1} \sum_i p_i q_i$.



Input: audio signal blocks $x_i(k)$, magnitude change $a$, permutated watermark chip $chip_s(k)$; $k$ varies from 0 to $M-1$, $s$ varies from 0 to $C-1$, $C$ is the number of all possible watermark characters, echo bit $e$, $e \in \{0,1\}$, and $M$ is the number of samples in the block

Watermark embedding procedure:

1) Compute the analysis window $H_a(n)$ (see [10])

2) Perform the MCLT on $x_i(k)$ and get the MCLT coefficients $X_i(k)$

3) Modify $X_i(k)$ according to the corresponding watermark chip $chip_s(k)$ and get $X'_i(k)$

$X'_i(k) = X_i(k) \cdot a$ if $chip_s(k) = 1$
$X'_i(k) = X_i(k) \cdot (1/a)$ if $chip_s(k) = 0$

4) Perform the inverse MCLT on $X'_i(k)$ and get $x'_i(k)$

$x'_i(k) = \sum_n X'_i(n) \, p_s(k, n)$

5) Divide the stereo audio signal block $x'_i(k)$ into the left channel portion $x'^l_i(p)$ and the right channel portion $x'^r_i(q)$; $p$ and $q$ vary from 0 to $M/2 - 1$

6) Embed the echoes into both channels of the block (see [1]) and get the resulting sequences, where $\alpha$ is the initial echo amplitude

$y_i^l(p) = x'^l_i(p) + \alpha \, x'^l_i(p - d_e)$
$y_i^r(q) = x'^r_i(q) + \alpha \, x'^r_i(q - d_{1-e})$

7) Recombine the left portion and the right portion and get the watermarked audio block $y_i(k)$

Output: processed audio signal blocks $y_i(k)$
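Step 3 of the embedding procedure, the chip-dependent magnitude change, can be sketched on its own. The MCLT and its inverse are not implemented here; the sketch assumes the coefficients are already available as a NumPy array (real or complex) and that the chips are given as an array of 0s and 1s. The function name is illustrative.

```python
import numpy as np

def modulate_coefficients(coeffs, chips, a):
    """Scale each coefficient by a where the chip is 1 and by 1/a where it is 0."""
    coeffs = np.asarray(coeffs)
    chips = np.asarray(chips)
    gains = np.where(chips == 1, a, 1.0 / a)
    # A real positive gain changes the magnitude only; the phase of a
    # complex coefficient is left untouched.
    return coeffs * gains
```

Because the gain is multiplicative, a chip raises or lowers the coefficient magnitude by the same amount on a logarithmic scale, $\pm 20 \log_{10} a$ dB.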



Input: watermarked audio signal blocks $y_i(k)$; $k$ varies from 0 to $M-1$, and $M$ is the number of samples in the block

Watermark extracting procedure:

1) Compute the analysis window $H_a(n)$ (see [10])

2) Perform the MCLT on $y_i(k)$ and get the MCLT coefficients $Y_i(k)$

$Y_i(k) = \sum_{n=0}^{M-1} y_{i-1}(n) \, p_a(n, k) + \sum_{n=M}^{2M-1} y_i(n - M) \, p_a(n, k)$

3) Compute the correlations of $Y_i(k)$ with all possible watermark chips in the pool and get the extracted watermark chip $chip_s(k)$

$Correlation(Y_i, chip_s) = Y_i \cdot chip_s = \max_t \{ Y_i \cdot chip_t \}$ for all possible $t$

4) Divide the stereo audio signal block $y_i(k)$ into the left channel portion $y_i^l(p)$ and the right channel portion $y_i^r(q)$; $p$ and $q$ vary from 0 to $M/2 - 1$

5) Compute the cepstrum of both channels (see [1])

$c_l(p) = F^{-1}(\log F(y_i^l(p)))$
$c_r(q) = F^{-1}(\log F(y_i^r(q)))$

6) Decide the echo hiding bit $e$

$e = 0$ if $c_l(d_0) - c_l(d_1) > c_r(d_0) - c_r(d_1)$
$e = 1$ if $c_l(d_0) - c_l(d_1) < c_r(d_0) - c_r(d_1)$

Output: extracted watermark chip $chip_s(k)$ and echo bit $e$
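The correlation step of the extraction procedure (step 3) can be sketched as follows. This is an illustration under assumptions: the detector is given a real-valued feature vector per block (for example mean-removed log magnitudes of the MCLT coefficients) and the candidate chip sequences are mapped to $\pm 1$; neither choice is stated in the text, and the function names are illustrative.

```python
import numpy as np

def normalized_inner_product(p, q):
    """p . q = (1/N) * sum_i p_i * q_i"""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    return float(np.dot(p, q)) / len(p)

def detect_chip_sequence(features, chip_pool):
    """Return the index t that maximizes features . chip_t, plus all scores."""
    scores = [normalized_inner_product(features, chips) for chips in chip_pool]
    return int(np.argmax(scores)), scores
```

The detector never needs the original audio: it only ranks the candidate sequences in the pool by their correlation with the received block.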
Experiment Results

We use Sound Quality Assessment Material (SQAM) 
audio [16] as test material, all 70 audio clips having a 
sampling frequency of 44100 Hz, 2 channels and a 
quantization of 16 bits. Various attacks are performed 
using Adobe Audition 3.0 and Audacity 1.3.6, which 
are both popular tool-sets for professional audio 
processing and editing. 

Transparency Evaluation 

SNR (Signal-to-Noise Ratio) [18] is a statistical 
difference metric which is used to measure the 
perceptual similarity between the undistorted original 
audio signal and the distorted watermarked audio 
signal. 
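The text does not spell out the SNR formula it uses. A common definition, assumed here, is the ratio of the energy of the original signal to the energy of the embedding distortion, expressed in decibels:

```python
import numpy as np

def snr_db(original, watermarked):
    """SNR in dB: original signal energy over embedding-noise energy."""
    original = np.asarray(original, dtype=float)
    noise = np.asarray(watermarked, dtype=float) - original
    return 10.0 * np.log10(np.sum(original ** 2) / np.sum(noise ** 2))
```

Under this definition a larger value means that the watermarked signal is closer to the original one.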

Robustness Evaluation

Since an extracted watermark is taken as a proof of
authorship, the embedded watermark should
withstand attempts at removing or damaging it. We
have simulated five kinds of attacks, namely
resampling, MP3 compression, white noise addition,
low-pass filtering and high-pass filtering, with the
audio processing software Adobe Audition 3.0 and
Audacity, which can be freely downloaded from the
Internet. The simulation results are listed below.

Another important criterion for watermarking
algorithms is their error rate in watermark
detection or extraction. We therefore report the
CBR (Correct Bit Rate).

The correct bit rates in Table 4 are the averages of the
correct bit rates obtained for the 70 audio clips. The
correct bit rate for each clip is defined as:

CBR = (Number of correctly extracted bits) / (Number of embedded bits for the clip)


TABLE 3. SNR OF ECHO HIDING AND ECHO HIDING WITH MCLT

audio sequence    SNR of our echo hiding      SNR of echo hiding
                  method in Section 2.1       with MCLT
clip 1-10         17.693                      18.810
clip 11-20        18.784                      22.810
clip 21-30        20.118                      23.473
clip 31-40        22.147                      15.650
clip 41-50        18.749                      19.163
clip 51-60        17.249                      19.358
clip 61-70        22.019                      18.354
average           19.537                      19.660




TABLE 4. CBR OF ECHO HIDING AND ECHO HIDING WITH MCLT UNDER VARIOUS ATTACKS

Attacks                    Proposed echo hiding    Our echo hiding
                           in Section 2.1          with MCLT
No attacks                 80.58%                  94.71%
Re-sampling (22.05 kHz)    82.83%                  95.71%
MP3 compression            56.47%                  86.30%
Noise attack               65.11%                  77.86%
Low-pass filtering         73.37%                  92.92%
High-pass filtering        64.15%                  88.80%




The attacks above are defined as follows:

No attacks: closed loop (decoding immediately after
encoding)

Re-sampling: resampling the watermarked signal at a
22.05 kHz sampling rate

MP3 compression: compressing the watermarked signal
with MPEG-1 Layer 3 and converting it back to the
original wave format

Noise attack: adding zero-mean white Gaussian noise
to the watermarked signal

Low-pass filtering: a first-order low-pass filter with a
cut-off frequency of 1600 Hz is used

High-pass filtering: a first-order high-pass filter with a
cut-off frequency of 1600 Hz is used
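Two of the attacks listed above are simple enough to approximate in code for quick experiments. The sketch below is not what Adobe Audition or Audacity compute internally: the one-pole recursion is one of several possible first-order low-pass designs, and the noise level and seed are hypothetical parameters.

```python
import numpy as np

def add_white_noise(signal, noise_std, seed=0):
    """Add zero-mean white Gaussian noise with the given standard deviation."""
    rng = np.random.default_rng(seed)
    return np.asarray(signal, dtype=float) + rng.normal(0.0, noise_std, size=np.shape(signal))

def first_order_lowpass(signal, cutoff_hz, sample_rate_hz):
    """One-pole low-pass filter: y[n] = (1 - g) * x[n] + g * y[n-1]."""
    g = np.exp(-2.0 * np.pi * cutoff_hz / sample_rate_hz)
    out = np.empty(len(signal), dtype=float)
    prev = 0.0
    for n, sample in enumerate(signal):
        prev = (1.0 - g) * sample + g * prev
        out[n] = prev
    return out
```

For the setting in the text, the filter would be called with `cutoff_hz=1600.0` and `sample_rate_hz=44100.0` on each channel separately.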

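Table 4 shows that combining echo hiding with the
spread spectrum method in the MCLT domain raises the
CBR under every tested attack (for example, from 56.47%
to 86.30% under MP3 compression), while Table 3 shows
that the average SNR is essentially unchanged (19.537
versus 19.660). The improvement comes at the cost of
lower embedding capacity and higher computational
complexity.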

According to previous work [8], the correct bit
rate of the echo method is no higher than 83%, which
our own tests confirm. We use majority voting to lower
the bit error rate. We embed every watermark bit
$2k+1$ times and then extract the copies. If at least
$k+1$ of the extracted copies of a bit are 0, we decide
that the bit is 0. If at least $k+1$ of the extracted
copies are 1, we decide that the bit is 1. Our
experimental results show that majority voting improves
the correct bit rate and robustness of the echo hiding
method at the cost of lower capacity.
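The majority vote and the correct bit rate can be sketched together. The sketch assumes that the $2k+1$ copies of each bit are stored consecutively; the text does not say how the copies are arranged, and the function names are illustrative.

```python
import numpy as np

def repeat_bits(bits, k):
    """Repeat every watermark bit 2k+1 times (copies stored consecutively)."""
    return np.repeat(np.asarray(bits), 2 * k + 1)

def majority_decode(received, k):
    """Decide each bit as 1 when at least k+1 of its 2k+1 copies are 1."""
    groups = np.asarray(received).reshape(-1, 2 * k + 1)
    return (groups.sum(axis=1) >= k + 1).astype(int)

def correct_bit_rate(embedded, extracted):
    """Fraction of extracted bits that equal the embedded bits."""
    return float(np.mean(np.asarray(embedded) == np.asarray(extracted)))
```

As a rough guide, if each copy were extracted correctly with probability 0.8 independently of the others, then with $k = 1$ the vote would be correct with probability $0.8^3 + 3 \cdot 0.8^2 \cdot 0.2 = 0.896$, while the capacity would drop to one third. The independence assumption is an idealization and is not a result from the experiments.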

Conclusions 

We first propose a new echo hiding method based on
the single echo hiding of [1]. At the same watermark
capacity it improves on single echo hiding in
transparency and in robustness under all tested attacks
except MP3 compression. We then further improve the
robustness of echo hiding by combining it with the
spread spectrum algorithm in the MCLT domain [17]. In
our experiments the combined method reaches a CBR
between 77.86% and 95.71% under the tested attacks.